Enable model-registry with UI by default #3318

Open
Raakshass wants to merge 6 commits into kubeflow:master from Raakshass:enable-model-registry-ui

Conversation

@Raakshass
Contributor

Raakshass commented Jan 4, 2026

Summary of Changes

This PR enables the Model Registry server, UI, and demo catalog components in the default Kubeflow installation (example/kustomization.yaml), updates the Central Dashboard to include a Model Registry menu entry, adds README documentation, and adds CI tests with model CRUD verification.

Components added to example/kustomization.yaml:

  • Model Registry Server with PostgreSQL database (overlays/postgres)
  • Model Registry Istio networking / VirtualService (options/istio)
  • Model Registry UI with Istio integration (options/ui/overlays/istio)
  • Model Catalog demo (options/catalog/overlays/demo)
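
As a rough sketch of the wiring listed above, the four components could be referenced from example/kustomization.yaml as follows. The `../applications/model-registry/upstream/` prefix is an assumption for illustration; the PR description only names the trailing path segments, so check the actual directory layout before copying.

```yaml
# example/kustomization.yaml (excerpt, illustrative sketch)
# Assumption: the "upstream" directory prefix is hypothetical.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
# Model Registry server with PostgreSQL database
- ../applications/model-registry/upstream/overlays/postgres
# Istio networking (VirtualService)
- ../applications/model-registry/upstream/options/istio
# Model Registry UI with Istio integration
- ../applications/model-registry/upstream/options/ui/overlays/istio
# Model Catalog demo
- ../applications/model-registry/upstream/options/catalog/overlays/demo
```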

Central Dashboard:

  • Updated applications/centraldashboard/overlays/oauth2-proxy/kustomization.yaml to use istio base overlay instead of kserve
  • Added patches/configmap.yaml with Model Registry menu entry alongside existing KServe Endpoints entry

CI / Testing:

  • Added tests/model_registry_install.sh — installs Model Registry server, UI, database, Istio networking, and catalog
  • Added tests/model_registry_test.sh — CRUD tests (creates RegisteredModel, ModelVersion, ModelArtifact, verifies listing) + Istio gateway auth tests
  • Updated .github/workflows/model_registry_test.yaml to run install and test scripts

Documentation:

  • Added "Model Registry" section to README.md under "Install Individual Components"

Dependencies

No external dependencies. Uses existing upstream manifests from applications/model-registry/.

Related Issues

Closes #3047

@github-actions

github-actions Bot commented Jan 4, 2026

Welcome to the Kubeflow Manifests Repository

Thanks for opening your first PR. Your contribution means a lot to the Kubeflow community.

Before making more PRs:
Please ensure your PR follows our Contributing Guide.
Please also be aware that many components are synchronized from upstream via the scripts in /scripts.
So in some cases you have to fix the problem in the upstream repositories first, but you can use a PR against kubeflow/manifests to test the platform integration.

Community Resources:

Thanks again for helping to improve Kubeflow.

@Raakshass
Contributor Author

hey @juliusvonkohout can you just review this pr.
Thank you

@juliusvonkohout
Member

juliusvonkohout commented Jan 11, 2026

> hey @juliusvonkohout can you just review this pr. Thank you

I am still on vacation, but maybe @tarilabs can help sooner.

Are you sure that the catalog and everything else is properly exposed in the dashboard UI, @Raakshass? Do you mind sharing screenshots? Think of how we expose the KServe models web application in the dashboard.

@juliusvonkohout
Member

juliusvonkohout commented Jan 11, 2026

@Raakshass are you sure that it is properly exposed similar to the kserve models web application (endpoints) in the dashboard UI? I would like to see screenshots of the dashboard and the actual UI changes you made. Please check the original issue and related ones in the Model-Registry git repository. I think you are missing 80% of the work.

@tarilabs
Member

Could you kindly share a screenshot with @ederign on this thread, as Julius suggested?

@sameerdattav
Contributor

Hey @ederign @juliusvonkohout @tarilabs,

I’ve been following this PR and the related issue for a few days and thought I could jump in to help move things forward.
So I went ahead and opened a fresh PR that includes all the required changes along with validation screenshots:

#3323

I’d really appreciate a review when you get a chance. Thanks!

@ederign
Member

ederign commented Jan 12, 2026

I've commented on #3323

@Raakshass
Contributor Author

Hi @juliusvonkohout @tarilabs — addressing the feedback about showing the actual dashboard/UI change.

What changed in this update

  • Added a Central Dashboard overlay: applications/centraldashboard/overlays/model-registry/
  • Added a JSON6902 patch that appends a new dashboard menu entry pointing to /model-registry/
    • Value added: {"text": "Model Registry", "link": "/model-registry/"}

Why this change

Kubeflow’s documentation for Model Registry installation and dashboard customization indicates the Model Registry entry should be added to the Central Dashboard configuration so it appears in the sidebar menu.

Verification status

  • This PR is focused on manifests wiring (dashboard link + overlays).
  • Local end-to-end screenshots are still pending; will follow up with real deployment verification + screenshots once the deployment environment is ready.

If you’d like the menu item to also include type/icon fields (as in the docs examples), please confirm the preferred values and I can update it accordingly.

@juliusvonkohout
Member

juliusvonkohout commented Jan 13, 2026

I think you can use a generically named one, applications/centraldashboard/overlays/kustomization.yaml.

We should also merge https://github.com/kubeflow/manifests/blob/master/applications/centraldashboard/overlays/oauth2-proxy/kustomization.yaml into that, because oauth2-proxy is mandatory anyway.

@Raakshass
Contributor Author

@juliusvonkohout Refactor complete!
I've consolidated the oauth2-proxy and model-registry overlays into a single applications/centraldashboard/overlays/kustomization.yaml as requested. Also switched to a Strategic Merge Patch to fix the JSON syntax error.
[Image: Screenshot 2026-01-14 001428]
Dashboard link is verified locally (screenshot attached).
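
For readers following along, a Strategic Merge Patch for the dashboard menu might look roughly like the sketch below. The ConfigMap name `centraldashboard-config`, the `links` key, the `menuLinks` layout, the KServe link path, and both `icon` values are assumptions based on the stock Central Dashboard configuration, not a copy of this PR's patch. Because ConfigMap data values are plain strings, a strategic merge replaces the whole `links` value, so existing entries have to be repeated in the patch.

```yaml
# patches/configmap.yaml (illustrative sketch, not the PR's actual file)
apiVersion: v1
kind: ConfigMap
metadata:
  name: centraldashboard-config   # assumed name of the dashboard ConfigMap
  namespace: kubeflow
data:
  # The whole string below replaces the existing value, so every existing
  # menu entry must be listed again; only two entries are shown here.
  links: |-
    {
      "menuLinks": [
        {
          "type": "item",
          "link": "/kserve-endpoints/",
          "text": "KServe Endpoints",
          "icon": "kubeflow:models"
        },
        {
          "type": "item",
          "link": "/model-registry/",
          "text": "Model Registry",
          "icon": "assignment"
        }
      ]
    }
```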

@Raakshass
Contributor Author

Hi @juliusvonkohout @kimwnasptd,

I wanted to follow up on this PR. I noticed it's listed as a related issue for GSoC 2026 Project 4 (Platform Scalability and Security) - which is exciting!

Is there anything else needed from my side to move this forward? Happy to make any additional changes.

Thanks for your time!

@juliusvonkohout
Member

@Raakshass please rebase to master, large dashboard changes have been merged.

Raakshass force-pushed the enable-model-registry-ui branch from d797340 to 72aa13f on April 9, 2026 19:46
google-oss-prow Bot added size/L and removed size/XL labels Apr 9, 2026
Raakshass force-pushed the enable-model-registry-ui branch 2 times, most recently from 6f14364 to 287a40e on April 10, 2026 06:50

- Harden AuthorizationPolicy to KFP dual-path pattern

- Add Model Registry (postgres), Istio networking, UI, and Catalog to example/kustomization.yaml

- Add Model Registry menu entry to dashboard configmap via overlay patch

- Consolidate CI Model Registry install/test into reusable scripts

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>

Raakshass force-pushed the enable-model-registry-ui branch from 287a40e to eb96333 on April 10, 2026 07:13

Root Cause: The demo overlay perf-data-init container uses busybox
(UID 0) which is incompatible with the base deployment runAsNonRoot
true security context, producing Init:CreateContainerConfigError.

The model_registry_test.sh does not test any catalog endpoints,
so deploying it in CI adds zero test coverage and a known failure
mode. The catalog remains in example/kustomization.yaml for user
installations.

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
@juliusvonkohout
Member

juliusvonkohout commented Apr 10, 2026

@Raakshass @tarilabs @Al-Pragliola f9c6031 is a blocker that we should address here.

We should not hide it, but address it:

# Note: Model Catalog (demo overlay) is NOT deployed in CI.
# The demo overlay's perf-data-init container (busybox, UID 0) is incompatible
# with the base deployment's runAsNonRoot: true security context. This is an
# upstream issue. The catalog is an optional demo component not covered by the
# integration tests in model_registry_test.sh, so deploying it in CI adds
# zero test coverage and a known failure mode. It remains in
# example/kustomization.yaml for user installations."

see also https://cloud-native.slack.com/archives/C073N7B6K3R/p1775814469294439

…ontainer

Apply upstream fix from kubeflow/hub#2568: add runAsNonRoot
and runAsUser: 65534 (nobody) to the perf-data-init container. The base
deployment enforces runAsNonRoot: true at pod level, but the demo
overlay init container uses busybox (UID 0), producing
Init:CreateContainerConfigError.

Re-add catalog deployment and wait blocks to model_registry_install.sh.

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
@Raakshass
Contributor Author

Raakshass commented Apr 10, 2026

> @Raakshass @tarilabs @Al-Pragliola f9c6031 is a blocker that we should address here.
>
> We should not hide it, but address it:
>
> # Note: Model Catalog (demo overlay) is NOT deployed in CI.
> # The demo overlay's perf-data-init container (busybox, UID 0) is incompatible
> # with the base deployment's runAsNonRoot: true security context. This is an
> # upstream issue. The catalog is an optional demo component not covered by the
> # integration tests in model_registry_test.sh, so deploying it in CI adds
> # zero test coverage and a known failure mode. It remains in
> # example/kustomization.yaml for user installations."
>
> see also https://cloud-native.slack.com/archives/C073N7B6K3R/p1775814469294439

root cause: the demo overlay's perf-data-init init container uses busybox (uid 0), but the base deployment enforces runAsNonRoot: true at pod level, so kubernetes refuses to create it, hence Init:CreateContainerConfigError.
fixed in 391883a: applied the same fix as kubeflow/hub#2568 (added runAsNonRoot: true and runAsUser: 65534 to the init container). catalog deployment and ci wait blocks are restored.
the rayclusters error in the ci log-collection step is a pre-existing master issue: ray's test.sh deletes its own crds during cleanup, then kubectl get all in Collect Logs on Failure crashes on the missing crd. unrelated to model registry.
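
For reference, the init-container fix from 391883a could be expressed as a patch along these lines. This is a minimal sketch: the Deployment name is hypothetical, and only the container name and the two securityContext fields come from the discussion above.

```yaml
# Strategic merge patch fragment (illustrative sketch).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-catalog-server   # hypothetical name, adjust to the real Deployment
spec:
  template:
    spec:
      initContainers:
      # initContainers are merged by name, so only the listed fields change.
      - name: perf-data-init
        securityContext:
          runAsNonRoot: true
          runAsUser: 65534   # "nobody", satisfies the pod-level runAsNonRoot: true
```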

@juliusvonkohout
Member

juliusvonkohout commented Apr 11, 2026

Now I just need some time to go thoroughly through the authentication and authorization architecture.

@juliusvonkohout
Member

@Raakshass please investigate #3318 (comment) and see whether you can make it more secure and add security tests for that.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment thread example/kustomization.yaml
Comment thread tests/model_registry_test.sh Outdated
…forward reuse

Add Test 8: verify unauthenticated requests (no Authorization header) are
denied by oauth2-proxy at the gateway (non-200 response).

Fix duplicate port-forward: reuse the existing localhost:8080 forward from
port_forward_gateway.sh instead of starting a conflicting second forward.

Improve gateway test comments to document the KFP dual-path
AuthorizationPolicy pattern and what each test validates.

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
@Raakshass
Contributor Author

> @Raakshass please investigate #3318 (comment) and see whether you can make it more secure and add security tests for that.

@juliusvonkohout
the AuthorizationPolicy is already hardened. abdullahpathan22's comment references the original upstream permissive rules: [{}], which i replaced with the KFP dual-path pattern in a prior commit. what they suggest is exactly what we already have:

  • model-registry-service → dual-path: gateway ServiceAccount + internal K8s JWT with kubeflow-userid anti-spoof. matches ml-pipeline-ui in istio-authorization-config.yaml line-for-line
  • model-registry-ui → gateway-only: istio-ingressgateway-service-account. matches jupyter, tensorboard, volumes, kserve web apps

security tests added in 33bbce0:

| test | what it validates | expected |
|------|-------------------|----------|
| test 6 | authorized default-editor token via gateway | 200 |
| test 7 | unauthorized default ServiceAccount token via gateway | 403 |
| test 8 (new) | unauthenticated request, no Authorization header | non-200 |
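
To make the dual-path pattern described above concrete, here is a hedged sketch of what such an AuthorizationPolicy could look like. It is not the PR's manifest: the selector label is hypothetical, and validation of the Kubernetes JWT for in-mesh callers is assumed to happen outside this policy.

```yaml
# Illustrative sketch of the dual-path pattern, not the PR's actual manifest.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: model-registry-service
  namespace: kubeflow
spec:
  action: ALLOW
  selector:
    matchLabels:
      app: model-registry-service   # hypothetical label
  rules:
  # Rule 1: traffic that entered through the Istio ingress gateway.
  - from:
    - source:
        principals:
        - cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account
  # Rule 2: in-mesh callers are allowed only when they do NOT set the
  # kubeflow-userid header, so a pod cannot impersonate a user.
  - when:
    - key: request.headers[kubeflow-userid]
      notValues: ['*']
```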

also fixed a bug where the test script started a duplicate port-forward on 8080, conflicting with port_forward_gateway.sh which already runs earlier in the CI workflow. now reuses the existing forward.

one thing i want to confirm: the anti-spoof rule (Rule 2) blocks internal pods from sending kubeflow-userid headers without going through the gateway. testing this through the gateway doesn't work because Rule 1 matches the gateway ServiceAccount regardless of headers. a proper in-mesh spoofing test would require deploying a temporary pod in a sidecar-injected namespace and exec'ing curl directly to model-registry-service.kubeflow.svc.cluster.local. should i add that, or is gateway-level coverage sufficient here?
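
One possible shape for the in-mesh spoofing test raised above is a long-running helper pod that a test script can exec into. Everything in this sketch is an assumption: the pod name, the profile namespace, the image, and the port and API path used in the commented curl call.

```yaml
# Hypothetical helper pod for an in-mesh spoofing check (sketch only).
apiVersion: v1
kind: Pod
metadata:
  name: spoof-test                       # hypothetical name
  namespace: kubeflow-user-example-com   # assumed sidecar-injected profile namespace
spec:
  containers:
  - name: curl
    image: curlimages/curl:latest        # pin a specific tag in real CI
    command: ["sleep", "3600"]
# Once the pod is Ready, a test script could run (port and path are assumptions):
#   kubectl exec -n kubeflow-user-example-com spoof-test -c curl -- \
#     curl -s -o /dev/null -w '%{http_code}' \
#     -H 'kubeflow-userid: attacker@example.com' \
#     http://model-registry-service.kubeflow.svc.cluster.local:8080/api/model_registry/v1alpha3/registered_models
# and assert that the status is 403, which is what Istio returns when no ALLOW rule matches.
```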

@juliusvonkohout
Member

juliusvonkohout commented Apr 22, 2026

> one thing i want to confirm: the anti-spoof rule (Rule 2) blocks internal pods from sending kubeflow-userid headers without going through the gateway. testing this through the gateway doesn't work because Rule 1 matches the gateway ServiceAccount regardless of headers. a proper in-mesh spoofing test would require deploying a temporary pod in a sidecar-injected namespace and exec'ing curl directly to model-registry-service.kubeflow.svc.cluster.local. should i add that, or is gateway-level coverage sufficient here?

More importantly, https://github.com/Raakshass/manifests/blob/33bbce056f367bf3c9be581f2b8e50eda6ae3944/.github/workflows/model_registry_test.yaml seems to have duplicated code, while https://github.com/kubeflow/manifests/pull/3318/changes#diff-4b30362af611e6ae2066d386c023437935b203936c034dd13c0fa669056c5175 and https://github.com/Raakshass/manifests/blob/33bbce056f367bf3c9be581f2b8e50eda6ae3944/.github/workflows/model_catalog_test.yaml are already cleaned up.

@Al-Pragliola @tarilabs please raise any objections; otherwise we will merge this soon after the remaining cleanup.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Comment thread tests/model_registry_install.sh Outdated
Comment thread applications/dashboard/overlays/istio/kustomization.yaml
Comment thread tests/model_registry_test.sh Outdated
Comment thread tests/model_registry_test.sh
… feedback

Replace 80 lines of inline duplicated install/test code in
model_registry_test.yaml with calls to the consolidated scripts
(model_registry_install.sh, port_forward_gateway.sh, model_registry_test.sh),
matching the model_catalog_test.yaml pattern.

Address Copilot review comments:
- Fix pod label selector from app=model-registry-db to component=db
  (matching upstream model-registry-db-deployment.yaml)
- Update test script prerequisites to list all dependencies
  (Istio gateway, oauth2-proxy, Kubeflow profile namespace)
- Tighten Test 8 unauthenticated assertion from non-200 to
  302/401/403 only, preventing false positives from 404 routing errors

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
google-oss-prow Bot added size/XL and removed size/L labels Apr 22, 2026
Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
@Raakshass
Contributor Author

> one thing i want to confirm: the anti-spoof rule (Rule 2) blocks internal pods from sending kubeflow-userid headers without going through the gateway. testing this through the gateway doesn't work because Rule 1 matches the gateway ServiceAccount regardless of headers. a proper in-mesh spoofing test would require deploying a temporary pod in a sidecar-injected namespace and exec'ing curl directly to model-registry-service.kubeflow.svc.cluster.local. should i add that, or is gateway-level coverage sufficient here?
>
> More importantly, https://github.com/Raakshass/manifests/blob/33bbce056f367bf3c9be581f2b8e50eda6ae3944/.github/workflows/model_registry_test.yaml seems to have duplicated code, while https://github.com/kubeflow/manifests/pull/3318/changes#diff-4b30362af611e6ae2066d386c023437935b203936c034dd13c0fa669056c5175 and https://github.com/Raakshass/manifests/blob/33bbce056f367bf3c9be581f2b8e50eda6ae3944/.github/workflows/model_catalog_test.yaml are already cleaned up.
>
> @Al-Pragliola @tarilabs please raise any objections; otherwise we will merge this soon after the remaining cleanup.

cleaned up model_registry_test.yaml in 15cd3e7: replaced all inline duplicated install/deploy/test code with three script calls matching the model_catalog_test.yaml pattern:

- name: Install Model Registry
  run: ./tests/model_registry_install.sh
- name: Port forward Istio gateway
  run: ./tests/port_forward_gateway.sh
- name: Run Model Registry Tests
  run: ./tests/model_registry_test.sh

also addressed all copilot review feedback in the same commit and fixed an AP abbreviation in 1e76409.

on security: the AuthorizationPolicy already implements the KFP dual-path pattern (gateway ServiceAccount + internal K8s JWT with kubeflow-userid anti-spoof) matching ml-pipeline-ui line-for-line. tests validate authorized (200), unauthorized (403), and unauthenticated (302/401/403) access.
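
As an illustration of the tightened unauthenticated check (302/401/403 only), the assertion could be written roughly as below. It is shown as a standalone workflow step purely for readability; the PR keeps the check inside tests/model_registry_test.sh, and the `/model-registry/` request path is an assumption.

```yaml
# Illustrative workflow step, not part of the PR's workflow file.
- name: Verify unauthenticated access is denied
  run: |
    # Assumes port_forward_gateway.sh already forwards the gateway to localhost:8080
    # and that /model-registry/ is routed through oauth2-proxy.
    status=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/model-registry/)
    case "$status" in
      302|401|403) echo "denied as expected (HTTP $status)" ;;
      *) echo "unexpected HTTP status: $status"; exit 1 ;;
    esac
```

Restricting the accepted codes this way means a 404 from a missing route fails the step instead of passing as "non-200".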


Development

Successfully merging this pull request may close these issues.

Enable model-registry with UI by default

8 participants